Performance differences among commercially available antigen rapid tests for COVID-19 in Brazil

A rapid and accurate diagnosis is a crucial strategy for containing the coronavirus disease (COVID-19) pandemic. Considering the obstacles to upscaling the use of RT–qPCR, rapid tests based on antigen detection (Ag-RDT) have become an alternative to enhance mass testing, reducing the time to diagnosis and limiting virus spread. However, the performances of several commercially available Ag-RDTs have not yet been evaluated in several countries. Here, we evaluate the performance of eight Ag-RDTs available in Brazil to diagnose COVID-19. Patients admitted to tertiary hospitals with moderate or mild COVID-19 symptoms and presenting risk factors for severe disease were included. The tests were performed using a masked protocol, strictly following the manufacturer’s recommendations, and were compared with RT–qPCR. The overall sensitivity of the tests ranged from 9.8 to 81.1%, and specificity greater than 83% was observed for all the evaluated tests. Overall, slight or fair agreement was observed between Ag-RDTs and RT–qPCR, except for the Ag-RDT COVID-19 (Acro Biotech), for which moderate agreement was observed. Lower sensitivity of Ag-RDTs was observed for patients with cycle threshold > 25, indicating that the sensitivity was directly affected by viral load, whereas the effect of the disease duration was unclear. Despite the lower sensitivity of Ag-RDTs compared with RT–qPCR, their ease of execution and speed still justify their use, even at hospital admission. However, the main advantage of Ag-RDTs seems to be the possibility of increasing access to the diagnosis of COVID-19 in patients with a high viral load, allowing immediate clinical management and reduction of infectivity and community transmission.


Introduction
Initially described in China in December 2019, coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), rapidly spread worldwide, becoming a major global health concern and causing massive socioeconomic disruption [1,2]. Although the rapid development and availability of vaccines demonstrate an enormous technical and scientific response capacity, worldwide vaccination faces serious operational challenges in overcoming the economic inequalities between countries and the growing, though not new, phenomenon of vaccine rejection. Furthermore, the mutagenic capacity of the virus and the continuous emergence of variants mean that full control of the pandemic has not yet been achieved. Accurate diagnostic tests for SARS-CoV-2 remain necessary for monitoring and containing new waves of COVID-19 through the early diagnosis of cases, minimizing opportunities for transmission.
Real-time reverse transcription polymerase chain reaction (RT-qPCR) using respiratory specimens has been recommended as a reference diagnostic test for acute SARS-CoV-2 infection and has been widely implemented [3]. Although these tests have high sensitivity and specificity, they are challenging to implement at scale, particularly in resource-poor settings: they are costly and time-consuming and require laboratory infrastructure and highly trained technicians [4,5].
Antigen-based tests (Ag-RDTs) quickly emerged as a viable alternative to large-scale testing for SARS-CoV-2. Although they are inferior to RT-qPCR in terms of sensitivity and specificity [6,7], their potential advantages of ease of execution, low cost and short time until results, at the point of care without the need for a laboratory, make them tools of choice in favor of a timely decision-making process [8]. However, for Ag-RDT to fulfill this purpose, it is essential to recognize the performance of different tests and the impact of disease duration and other variables on their accuracy in real-life studies in different world regions. Significant differences between the accuracy of Ag-RDT reported by the manufacturers and that observed under field conditions have already been reported [9,10]. Most of these studies evaluated hospitalized patients with severe forms of COVID-19 and in the first days of symptoms, conditions with a presumed high viral load in respiratory specimens, which possibly favors test performance [11,12]. Herein, we aimed to evaluate in parallel the performance of eight Ag-RDTs commercially available in Brazil in patients presenting mild respiratory symptoms and COVID-19 suspicion.

Methods
This study was performed according to the Standards for Reporting of Diagnostic Accuracy (STARD) statement [13].

Study design, sample processing and diagnostic procedures
This prospective study was conducted in a tertiary hospital setting in the municipality of Belo Horizonte, state of Minas Gerais, Brazil. Patients presenting a clinical suspicion of COVID-19 were consecutively enrolled and tested on hospital admission by RT-qPCR for SARS-CoV-2. Only patients with a mild clinical condition who were eligible for noninvasive treatment were included in this study. For each recruited participant, two samples were collected through consecutive paired nasopharyngeal swabs, one from each nostril. Depending on the manufacturers' requirements, the rapid test was performed at the bedside immediately after collection (maximum of 30 minutes) or, if explicitly authorized, was performed using Universal Transport Medium (UTM; Copan UTM system; Copan, Italy; catalog no. 3C047N) in the reference laboratory within a maximum of 12 hours after sampling. With the UTM, several tests could be performed from the clinical specimen obtained from a single participant, limited only by the total volume of transport solution available. The UTM volume was always limited to a maximum of 2 mL to minimize sample dilution. Therefore, whenever authorized by the manufacturer, the UTM was the preferred source for performing the test.
Considering these two possibilities for test execution, we developed a study schedule based on two phases. In the first phase, the swab used in the first nostril was kept in transport medium, and the RT-qPCR and four Ag-RDTs (randomly chosen among the six UTM-enabled Ag-RDTs) were performed from this solution. The swab used in the second nostril was used directly in the execution of one Ag-RDT (randomly chosen among the two Ag-RDTs based on nasopharyngeal samples whose manufacturers did not allow execution with UTM). Likewise, in the second phase, the swab used in the first nostril was immersed in UTM, which was used in the execution of the RT-qPCR and the other two UTM-enabled Ag-RDTs. The swab used in the second nostril was used directly in the execution of one Ag-RDT, and the patients underwent a new collection, this time covering only the nasal mucosa, for execution of a commercial test based exclusively on nasal (and not nasopharyngeal) swab collection. The number of participants varied depending on the total patients recruited at each study phase, which ended when the minimum number of samples required (52 patients, according to the sample calculation) was reached at the end of the day. At the end of each study phase, depending on the availability of Ag-RDTs, recruitment and testing were maintained for two additional days so that some patients could be tested simultaneously using dry swab and UTM sources, as a secondary analysis. Throughout the study, RT-qPCR results were masked to all researchers involved in reading the rapid tests.

Sample calculation
For a comparison based on agreement with the reference test (RT-qPCR), the minimum population required for one test validation was estimated as 52 patients, according to Arifin et al. (2021) [14]. The premises were a power of 80%, a significance level of 5%, a minimum disease prevalence in the sample of 50% and a minimum acceptable Kappa of 0.5, with an expected Kappa of 0.8. Patients with undetermined RT-qPCR results were excluded from this analysis.

Selection and execution of antigen-detection rapid diagnostic tests (Ag-RDTs)
All manufacturers of the 58 Ag-RDTs for COVID-19 registered at the Brazilian National Health Surveillance Agency (ANVISA) until March 2021 were contacted and invited to participate in this validation study. Of the 16 tests whose manufacturers agreed to participate in this validation, eight were selected based on the availability of supplying the kits within 40 days and possibility of execution from transport medium using a minimum required volume. All the tests were performed strictly according to the manufacturer's instructions using the buffer provided in each kit or UTM. The main characteristics of the selected Ag-RDTs are shown in Table 1 and include information concerning UTM use.
For tests performed immediately after sampling, the nasopharyngeal swab was immersed in the buffer solution, which was then dripped onto the test plate. For tests performed using samples stored in UTM, this medium was diluted in the buffer, and the final solution was dripped in the appropriate place on the reagent strip. In both cases, after the recommended waiting time, the control and test bands were observed in the test membrane. The test was considered positive if the control band was reactive and a band of any intensity was observed at the test band position.
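The reading rule described above can be written down as a small decision function. This is only an illustrative sketch: the function name, the numeric intensity argument and the "invalid" outcome for a missing control band are assumptions, since the text only defines when a test was considered positive.

```python
def read_ag_rdt(control_band_visible: bool, test_band_intensity: float) -> str:
    """Interpret one lateral-flow cassette from its control and test bands."""
    if not control_band_visible:
        # Assumption: the text does not state how a non-reactive control
        # band was handled; it is labeled "invalid" here for illustration.
        return "invalid"
    # A test band of any intensity, however faint, counts as positive.
    return "positive" if test_band_intensity > 0 else "negative"
```

For example, a faint test band with a reactive control band, `read_ag_rdt(True, 0.05)`, is read as positive.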

RT-qPCR
RNA extraction was performed using a MagMax Viral/Pathogen Nucleic Acid Isolation Kit (ThermoFisher Scientific, Waltham, MA, USA) or a Chemagic Viral DNA/RNA H96 kit (PerkinElmer, Waltham, MA, USA), and amplification was performed using a commercial rRT-qPCR kit (TaqPath COVID-19 CE-IVD RT-qPCR; ThermoFisher Scientific) containing ORF1ab, Nucleocapsid (N) and Spike (S) as target sequences for SARS-CoV-2. All RT-qPCRs were performed using QuantStudio 5 (ThermoFisher Scientific, Waltham, MA, USA). A cycle threshold (Ct) value of 37 was designated the cutoff value for positive results. Amplifications in at least two target regions of SARS-CoV-2 were considered positive, and the absence of an amplification signal was considered negative. Any other RT-qPCR results were considered inconclusive.
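The interpretation rule above can be sketched as a short function. It is a hedged illustration, not the instrument software's logic: the function name, the use of `None` for a target without amplification, and the choice to treat a Ct equal to 37 as within the cutoff are assumptions.

```python
from typing import Dict, Optional

CT_CUTOFF = 37.0                  # cutoff stated in the text
TARGETS = ("ORF1ab", "N", "S")    # targets of the RT-qPCR kit


def classify_rt_qpcr(ct_values: Dict[str, Optional[float]]) -> str:
    """Classify one sample; None means no amplification signal for a target."""
    amplified = [ct_values.get(t) for t in TARGETS
                 if ct_values.get(t) is not None]
    # Assumption: a Ct exactly equal to the cutoff counts as within it.
    within_cutoff = [ct for ct in amplified if ct <= CT_CUTOFF]
    if len(within_cutoff) >= 2:
        return "positive"         # at least two target regions amplified
    if not amplified:
        return "negative"         # no amplification signal at all
    return "inconclusive"         # any other pattern
```

A sample with Ct values only for ORF1ab and N, both below 37, is classified as positive, while a single amplified target is inconclusive.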

Data analysis
Descriptive statistics were used to present the main characteristics of the population. We used the Shapiro-Wilk normality test to evaluate whether the data were normally distributed. Continuous variables were presented as medians and interquartile ranges (IQR 25-75%), and the Mann-Whitney U test was used to compare medians. Categorical variables, expressed as numbers (percentages), were compared by the chi-squared test or Fisher's exact test as appropriate. The accuracy of the Ag-RDTs was determined with reference to the RT-qPCR result using MedCalc Software (Version 20.015). Based on a two-by-two contingency table, sensitivity was defined as the number of true positive patients on the Ag-RDTs divided by the total of positive patients on the RT-qPCR, and specificity was defined as the number of true negative patients on the Ag-RDTs divided by the total of negative patients on the RT-qPCR. Finally, accuracy was determined as the number of concordant RT-qPCR and Ag-RDT results divided by the total number of tested patients. RT-qPCR was defined as the reference standard, and the agreement was calculated using the Kappa index and interpreted following the criteria of Landis and Koch (1977) [15] as follows: <0, no agreement; 0-0.2, slight agreement; 0.2-0.4, fair agreement; 0.4-0.6, moderate agreement; 0.6-0.8, substantial agreement; 0.8-1, almost perfect agreement. The sensitivity, specificity, accuracy and Kappa index were presented with 95% confidence intervals (95% CIs). For each Ag-RDT, complementary analyses were performed stratifying patients by days of symptoms (up to 7 days or more than 7 days) and by the mean Ct across targets, using logistic regression with Minitab Statistical Software. Although the mean Ct across the three gene targets has no diagnostic meaning, we used this value as a proxy for the magnitude of the total viral load in the sample, a mathematical strategy to correlate the viral load with test performance. Table 2.
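The definitions above can be illustrated with a minimal sketch that derives sensitivity, specificity, accuracy and Cohen's Kappa from a two-by-two table. The function names and the example counts are hypothetical and are not study data. Treating each lower bound as inclusive in the Kappa interpretation is also an assumption, because the ranges quoted in the text share their boundary values.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy measures of an Ag-RDT against RT-qPCR as the reference."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one RT-qPCR positive and one negative")
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)      # true positives / RT-qPCR positives
    specificity = tn / (tn + fp)      # true negatives / RT-qPCR negatives
    accuracy = (tp + tn) / n          # concordant results / all tested
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "kappa": kappa}


def interpret_kappa(kappa: float) -> str:
    """Landis and Koch labels; lower bounds inclusive (assumption)."""
    if kappa < 0:
        return "no agreement"
    for upper, label in ((0.2, "slight"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "substantial")):
        if kappa < upper:
            return label
    return "almost perfect"


# Hypothetical counts for illustration only.
example = diagnostic_metrics(tp=20, fp=2, fn=10, tn=28)
```

With these hypothetical counts, the chance-expected agreement is 0.5 and the observed agreement is 0.8, so Kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6. Confidence intervals are omitted from this sketch.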

Ag-RDT results
Overall, the sensitivity of the Ag-RDTs ranged from 9.8 to 81.1%, and the specificity was higher than 83.3% for all evaluated tests (Table 3). The agreement beyond chance expressed by the Kappa index demonstrated fair agreement (0.2 ≤ K < 0.4) for most Ag-RDTs except for the Ag-RDT COVID-19 (Acro Biotech) test, for which moderate agreement was observed (K = 0.53) and for CORIS Bioconcept 1 Ag-RDT (Nanosens) that presented slight agreement (K = 0.04).
The agreement between the results of three commercial tests (COVID-19 Ag ECO teste (Eco Diagnostica), Coris Bioconcept Ag-RDT (Nanosens) and SARS-CoV-2 Ag-RDT (SD BIOSENSOR)) performed on the same patients using directly obtained respiratory secretions and the UTM source is shown in S1 Table and was considered slight or fair.
The Ct value obtained by RT-qPCR, an indirect measure of viral load, was correlated with Ag-RDT positivity. False negative Ag-RDT results were mostly observed in patients with high Ct values for all Ag-RDTs evaluated (t test; p<0.05; Fig 1a). For some Ag-RDTs, patients with more days since symptom onset had more false negative results (t test; p<0.05; Fig 1b). The greater sensitivity among patients with Ct<25 corroborates these findings (Table 4). Only for the Ag-RDT COVID-19 (Acro Biotech) test was there no statistically significant difference in sensitivity according to the Ct values. Regarding the days since symptom onset, a numerical reduction in sensitivity in the second week of symptoms was observed for all Ag-RDTs, with a significant difference only for the CorisBioconcept and Celler Wondfo tests.
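The stratified comparison described above can be sketched as follows. The record layout, field names and example values are hypothetical and are not study data. Assigning a value exactly equal to the cutoff to the upper stratum is an assumption.

```python
from typing import Dict, Iterable, Optional


def sensitivity_by_stratum(records: Iterable[dict], variable: str,
                           cutoff: float) -> Dict[str, Optional[float]]:
    """Ag-RDT sensitivity among RT-qPCR-positive patients, split at a cutoff.

    Each record must hold the stratifying variable (e.g. "mean_ct") and a
    boolean "ag_rdt_positive". Only RT-qPCR-positive patients are passed in.
    """
    counts = {"below": [0, 0], "at_or_above": [0, 0]}  # [Ag positives, total]
    for rec in records:
        stratum = "below" if rec[variable] < cutoff else "at_or_above"
        counts[stratum][0] += int(rec["ag_rdt_positive"])
        counts[stratum][1] += 1
    # None marks an empty stratum, where sensitivity is undefined.
    return {name: (pos / total if total else None)
            for name, (pos, total) in counts.items()}


# Hypothetical RT-qPCR-positive patients, for illustration only.
patients = [
    {"mean_ct": 18.0, "ag_rdt_positive": True},
    {"mean_ct": 22.0, "ag_rdt_positive": True},
    {"mean_ct": 24.0, "ag_rdt_positive": False},
    {"mean_ct": 28.0, "ag_rdt_positive": False},
    {"mean_ct": 31.0, "ag_rdt_positive": True},
]
by_ct = sensitivity_by_stratum(patients, "mean_ct", cutoff=25.0)
```

The same function can split by symptom duration: with a field holding whole days of symptoms, a cutoff of 8 places patients with up to 7 days in the lower stratum.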
Using logistic regression, the likelihood of a false negative Ag-RDT result was associated with the RT-qPCR Ct value (Fig 2). The probability of Ag-RDT positivity decreased as the Ct value increased, with a 50% probability of a positive Ag-RDT result expected at Ct values close to 25.
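The relation between a fitted logistic model and the Ct value at which positivity reaches 50% can be sketched as below. The coefficients are hypothetical, chosen only so that the midpoint falls at Ct 25, and are not the fitted values from this study. For a model with log-odds `intercept + slope * Ct`, the probability equals 0.5 where the log-odds are zero, that is, at `Ct = -intercept / slope`.

```python
import math


def positivity_probability(ct: float, intercept: float, slope: float) -> float:
    """Probability of a positive Ag-RDT result under a logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * ct)))


def ct_at_half_probability(intercept: float, slope: float) -> float:
    """Ct value at which the modeled probability of positivity is 50%."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return -intercept / slope


# Hypothetical coefficients (not estimated from the study data).
INTERCEPT, SLOPE = 5.0, -0.2
ct50 = ct_at_half_probability(INTERCEPT, SLOPE)
```

A negative slope encodes the observed pattern: the modeled probability of a positive Ag-RDT falls as Ct rises, that is, as viral load decreases.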

Discussion
Because the COVID-19 pandemic remains a worldwide health problem, Ag-RDTs have arisen as important alternative diagnostic methods to RT-qPCR, increasing the possibility of mass testing, decentralizing the diagnosis and shortening the time to a confirmed diagnosis [8]. Several Ag-RDTs have been developed by various manufacturers from multiple countries and have been evaluated independently by researchers. However, most validation studies have evaluated the same tests, mainly in Europe [16-18], highlighting the need for studies in the countries where the tests are commercialized, because the population characteristics and different circulating virus variants may affect their performance [19]. Here, we evaluated the performance of eight Ag-RDTs for COVID-19 currently available in Brazil in hospitalized patients with moderate or mild disease and risk factors for severe disease requiring close monitoring. According to the Target Product Profiles (TPP) published by WHO, Ag-RDTs should achieve ≥ 80% sensitivity and ≥ 97% specificity compared with a nucleic acid amplification test [20]. Notably, the tests evaluated here presented sensitivity ranging from 9.8 to 81.1% and specificity close to 100%, and none achieved the recommended performance. The low sensitivity reported here may be associated with the disease stage of the patients included, in most cases in the second week of illness. However, even among patients with up to 7 days of symptoms, the highest sensitivity obtained was 83.3% (Table 4). The recruitment strategy in a tertiary infectious disease hospital explains the inclusion of patients in the second week of illness, a period in which clinical manifestations usually require monitoring and medical support but which is characterized by decreasing viral shedding.
Conversely, considering the Ct values of RT-qPCR as an indirect reference for viral load in SARS-CoV-2 infections [18], a lower sensitivity was observed for patients presenting a Ct greater than 25 in all Ag-RDTs, indicating that the sensitivity was directly affected by viral load and indirectly affected by disease length. These observations reinforce that the ideal period for the use of rapid tests may be within the first seven days of illness. However, the need for judicious allocation of patients in the hospital environment and the diagnostic opportunity represented by attendance at the health unit when clinical symptoms intensify in the second week of illness justify the interest in evaluating the performance of antigen-based tests in this population. The performance of Ag-RDTs may be related to the intrinsic characteristics of the patients, such as the viral load, disease severity and length of symptoms, as well as to the characteristics of the tests, the quality of the specimen and proper handling [21,22]. Overall, a reduced sensitivity was observed for patients presenting more than seven days since symptom onset. However, the test with the highest sensitivity (Ag-RDT COVID-19-Acro Biotech) exhibited the same performance regardless of the number of days with symptoms, suggesting a differentiated sensitivity capable of overcoming the reduction of viral load.
Regarding sample processing, several Ag-RDT manufacturers allow the use of UTM for the temporary storage of nasopharyngeal samples. Besides the additional time gained between sample collection and testing, the transport medium allows the same sample to be used for the execution of different tests, which could be a strategy to allow the execution of RT-qPCR on the same specimen already collected in the case of a negative Ag-RDT result.
Even when the test is performed in accordance with the manufacturer's instructions, one possible limitation of this strategy is the potential of the transport medium to influence the performance of the test by diluting the clinical specimen. Here, the comparison between results obtained with the same test but performed using a dry swab or UTM source revealed a worryingly weak agreement. Similar results were described by Cubas-Atienzar et al. (2021), who observed a lower analytical sensitivity for Ag-RDTs using UTM than using dry swabs [23].
All evaluated Ag-RDTs are based on a sandwich immunodetection methodology with intrinsic characteristics that may affect their performance. Only two of the test manufacturers (CORIS Bioconcept Ag-RDT (Nanosens) and Panbio™ Ag-RDT Device Nasal (Abbott Rapid)) clearly stated that the virus nucleocapsid (N) protein was used as the specific antigenic target. This structural protein is often used because of its relative abundance and because it presents the least amount of variation in the gene sequence, indicating that it is a stable protein [16,24]. However, the presence of mutations in SARS-CoV-2 altering the expression of viral proteins may potentially impact Ag-RDT performance, and accuracy results in scenarios of genetic variability should be interpreted with caution. During this study, P.1 and P.2 were the variants prevalent in Brazil. Repeated validations are required in order to verify the ability of Ag-RDTs to diagnose the currently circulating strains [25,26].
Several limitations may affect the test's accuracy. This study was conducted under the same controlled laboratory conditions (temperature and lighting). Samples were collected and tests were performed by trained staff, and a single Ag-RDT batch was used throughout the study. The total sample of patients to be included was supported by sample calculation; however, subgroup analysis should be interpreted carefully considering the heterogeneity of the population. Although Ag-RDTs presented lower sensitivity than RT-qPCR, those tests may be a useful diagnostic tool for COVID-19, rapidly detecting patients with high viral loads. These results confirm that the performance of rapid tests based on antigen detection for SARS-CoV-2 in the routine of laboratories or health services may be inferior to that described by the manufacturers and that marked differences exist between commercial brands. Viral load seems to be the main determinant of test positivity, explaining the influence of symptom duration on observed performance. Even so, a positive Ag-RDT remains useful to diagnose symptomatic cases at hospital admission, particularly in terms of the speed of results, considering that a negative result does not rule out SARS-CoV-2 infection. The main benefit of Ag-RDTs would be to confirm the COVID-19 diagnosis in patients with higher viral shedding and possibly greater infectivity, reducing the number of cases requiring RT-qPCR. Thus, diagnostic algorithms combining tests with different methodologies must be evaluated in cost-effectiveness studies to confirm the best strategy for using rapid tests. Therefore, the various tests must be evaluated in different populations, justifying further studies in real-life scenarios, such as this one.